Preprint date: February 2, 2008 

Preprint typeset using LATg^X style cmulateapj v. 11/12/01 



A REDSHIFT SURVEY OF NEARBY GALAXY GROUPS: THE SHAPE OF THE MASS DENSITY 

PROFILE 

Andisheh Mahdavi^ 

Institute for Astronomy, University of Hawaii, 2680 Woodlawn Drive, Honolulu, HI 96822 

AND 



o 
o 

(N 

D 
tin 



> 
O 

o 

:^ 

Oh 

6 



X 



Margaret J. Geller^ 

Smithsonian Astrophysical Observatory, 60 Garden St., Cambridge, MA 02138 
Accepted for publication in the Astrophysical Journal 

ABSTRACT 

We constrain the mass profile and orbital structure of nearby groups and clusters of galaxies. Our 
method yields the joint probability distribution of the density slope n, the velocity anisotropy /?, and 
the turnover radius rg for these systems. The measurement technique does not use results from N-body 
simulations as priors. We incorporate 2419 new redshifts (included here) in the fields of 41 systems of 
galaxies with z < 0.04. The new groups have median velocity dispersion a — 360 km s^^. We also use 
851 archived redshifts in the fields of 8 nearly relaxed clusters with z < 0.1. Within R < 2r200) the data 
are consistent with a single power law matter density distribution with slope n =1.8-2.2 for systems 
with a < 470 km and n =1.6-2.0 for those with a > 470 (95% confidence). We show that a simple, 
scale-free phase space distribution function f{E,L^) cx {—E)"~^/'^L~^^ is consistent with the data as 
long as the matter density has a cusp. Using this DF, matter density profiles with constant density cores 
(n = 0) are ruled out with better than 99.7% confidence. 



1. INTRODUCTION 

The galaxies in a cluster account for less than 10% of its 
total mass. X-ray observations of the hot intracluster gas 
support the notion that most of the matter is distributed in 
a smooth, dark halo. It is of clear interest to use observa- 
tions to deduce not only the total mass, but also the struc- 
ture of this halo. For example, the logarithmic derivative 
of the density can be compared with N-body simulations 
of structure formation, allowing constraints on dark mat- 
ter physics. These simulations show that the inner regions 
of halos composed of coUisionless dark matter have power 
law density profiles p cx r~" with n ^ 1 — 1.5 (Navarro 
et al. 1997, NEW; Fukushige & Makino 1997; Nakano & 
Makino 1999; Moore et al. 1999). However, if the dark 
matter particles interact through the weak or the strong 
force, constant density cores develop within a fraction of a 
Hubble time (Burkert 2000; Dave et al. 2001; Balberg et al. 
2002). Measurements of the slope n in groups and clusters 
of galaxies are thus a potential a test of the coUisionless 
nature of dark matter. 

The methods for determining the masses of clusters fall 
into three broad categories: fluid dynamics, lensing, and 
stellar dynamics. The fluid dynamical methods, which rely 
on the X-ray properties of the intracluster medium, pre- 
suppose that the gas is either in hydrostatic equilibrium 
or in a time-independent cooling flow. Because measuring 
the gas temperature involves complex modeling of X-ray 
spectra, these methods offer at best ~ 50 resolution ele- 
ments across the brightest X-ray sources. Furthermore, re- 
cent X-ray observations indicate that the central regions of 
cooling flow clusters cannot be described self-consistently 
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by the standard models, leaving the shape of the mass 
profile within ^ 200 kpc of the cluster center uncertain^ 
(Markevitch et al. 1999; Soker et al. 2001; Fabian et al. 
2001). 

Lensing models detect either strong or weak gravita- 
tional deflection of the light from distant sources. Strong 
lensing allows direct estimation of the surface mass density 
from the spectacular but rare giant arcs, e.g. CL0024-t-16 
(Tyson et al. 1998). Weak lensing estimates depend on 
statistical reconstruction of the surface density from dis- 
tortions in the shapes of field galaxies (Clowe et al. 2000; 
Sheldon et al. 2001). These techniques are attractive be- 
cause they make no assumptions about the equilibrium 
state of the cluster. However, they could suffer contamina- 
tion from the large scale structure surrounding the cluster 
(Metzler, White, & Loken 2001). This contamination may 
account for the disagreement between X-ray and lensing 
masses, and makes the determination of the true shape of 
the density profile difhcult. 

Stellar dynamical methods grow out of the century-old 
tradition, beginning with Jeans and Eddington, of model- 
ing the structure of star clusters. Here, instead of address- 
ing a self-gravitating system of stars, one regards galaxies 
as point masses adrift in a larger sea of dark matter. The 
fundamental dynamical problem is then to calculate the 
spherically symmetric gravitational potential that causes 
the observed motions of the galaxies. Obviously, the anal- 
ogy to stellar systems is far from perfect; the ratio of the 
size of a galaxy to that of its host cluster is typically « 
15 kpc / 1.5 Mpc — 0.01, whereas for stars in a galaxy 
it is w 10^^^. As a result, interaction cross sections are 
much larger in galaxy clusters. Furthermore, although 
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stellar systems contain as many as 10^^ members, clusters 
rarely have more than Ri 500 luminous members, and the 
most abundant systems — groups of galaxies — have closer 
to w 30 (Carlberg et al. 1996; Mahdavi et al. 1999). 

Still, if correctly applied, stellar dynamics can trace the 
gravitational potential of clusters at the largest and small- 
est scales as well as the other available techniques. Mod- 
eling of spherical infall patterns (Geller et al. 1999; Rines 
et al. 2000) can map the mass profile outside « 5 Mpc. 
The equilibrium models wc consider arc sensitive to the 
shape of the dark matter halo in the innermost regions of 
clusters and groups. 

The most popular equilibrium techniques make use of 
the moments of the data to constrain the depth or shape 
of the cluster potential. For example, the virial theorem 
yields an estimate of the total mass of the cluster (Heisler 
et al. 1985; Biviano et al. 1993; Oegerle et al. 1995; Carl- 
berg et al. 1996; Girardi & Giuricin 2000): 
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where is the line-of-sight velocity in the center of mass 
frame (Danese et al. 1980), Zi is the redshift of the ith 
galaxy, Ri is its projected distance from the cluster cen- 
ter, c is the speed of light, and G is Newton's constant. 

The velocity moments may also be used to infer the 
structure of the potential in greater detail, using either an- 
alytic galaxy distribution functions (Kent & Gunn 1982) or 
the Jeans equation for a coUisionless stellar system (Fab- 
ricant et al. 1989; Binney & Tremaine 1987; den Hartog & 
Katgert 1996; Carlberg et al. 1997), 
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where r is the true three-dimensional distance from the 
cluster center, v{r) is the number distribution of the galax- 
ies, at{r) and (Jr{r) are the tangential and the radial ve- 
locity dispersions, and $ is the gravitational potential. To 
interpret the data, one usually chooses a form for <I>(r) and 
the anisotropy parameter /3(r) = 1 — ct|/ct^, and projects 
the solution to the Jeans equation to obtain az{R), the 
theoretical line-of-sight velocity dispersion profile. This 
profile may be compared with a real cluster by splitting 
the galaxies into radial bins and calculating oc ^ in 
each bin. 

There are several disadvantages to using the velocity 
moments to constrain $. First, the quality of the observed 
velocity dispersion profile (7z{R) is usually poor for clus- 
ters. Even with several hundred velocities, dividing the 
galaxies into radial bins produces a noisy profile that is 
not very informative about the radial variation of (T^. Sec- 
ond, even in the ideal limit of a perfectly observed az{R), 
vastly different combinations of $(r) and /3 can yield sim- 
ilar solutions to equation (4) (Binney & Tremaine 1987). 
Third, even if a unique solution is possible (e.g., by assum- 
ing a constant mass-to-galaxy ratio, oc z/), there is no 



guarantee that the solutions satisfy the requirement that 
the phase space density of the member galaxies be every- 
where positive or zero (van der Marel et al. 2000). Finally, 
it is desirable to avoid binning the velocities altogether in 
order to make as powerful a use of the scarce data as pos- 
sible. We therefore turn to maximum likelihood methods, 
which avoid some of the problems of the Jeans equations 
as applied to discrete systems (Merritt & Saha 1993; van 
der Marel et al. 2000). Constructing and max;imizing a 
suitable likelihood function guarantees a positive definite 
galaxy phase space distribution. 

Here we apply the maximum likelihood method to 
nearby (z < 0.1) systems of galaxies. Our goal is to derive 
joint constraints on $ and f3 for poor groups of galaxies as 
well as for rich clusters. To this end we conduct deep opti- 
cal observations of the RASSCALS X-ray emitting galaxy 
groups (Mahdavi et al. 2000). We also assemble a catalog 
of published redshifts in 8 nearby relaxed clusters of galax- 
ies (§2). Using these samples, we construct an ensemble 
group and ensemble cluster which serve to constrain the 
galaxy phase space distribution in each type of system 
(§3). This distribution then serves as a maximum likeli- 
hood estimator of the gravitational potential and of the 
orbital structure of the galaxy population. In particular, 
we calculate joint five- dimensional confidence volumes in 
the central anisotropy, central matter density slope, to- 
tal mass, matter core radius, and interloper fraction (§4). 
Wc discuss the implications of our work (§5) and conclude 
(§6). 

2. DATA 

Here we describe the data acquisition. First, we dis- 
cuss our observations of primarily poor groups of galaxies. 
From these data wc obtain a statistically complete sam- 
ple of 2419 optical spectra in the fields of 41 systems. We 
also assemble a redshift catalog of galaxies in relax;ed rich 
clusters from the literature. 

2.1. New Observations 

Our redshift measurement targets (Table 1) are drawn 
from Mahdavi et al. (2000), who cross-correlate the Center 
for Astrophysics Redshift Survey with the ROSAT All-Sky 
Survey (RASS) to construct the RASSCALS catalog. The 
RASSCALS are a statistically complete, redshift-selected 
sample of galaxy groups with at least 5 members brighter 
than a Zwicky blue magnitude mz = 15.5, comparable 
to a red magnitude rriR « 14.4. The groups lie within a 
redshift range 3000 < cz < 12000 km s~^, and have either 
X-ray luminosities or upper limits derived from the RASS. 
The diffuse X-ray emission coincident with the detected 
groups suggests that they are physically bound systems of 
galaxies. 

For this study, we select all X-ray emitting RASSCALS 
with right ascension 8 < Q!2ooo < 16, or Q!2ooo > 22 
and q;2000 < 2, and declination 62000 > ^8. We restrict 
ourselves to systems well away from the local superclus- 
ter, cz > 5100 km s~^, and attempt to ensure that they 
are group-like by limiting ourselves to X-ray luminosities 
10*2 < Lx < lO^^Koo erg s"^ We also omit the Hick- 
son (1982) compact groups 51 and 58. In addition to this 
93% complete sample of 30 X-ray emitting groups, we ran- 
domly select 11 groups without RASS X-ray emission, all 
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located within the same (a2ooo, 1^20007 ^) bounds. There 
are 41 systems in all. 

To constrain the variation of the velocity dispersion, 
and hence the mass, with radius, it is necessary to iden- 
tify group members to a large distance from the center of 
the gravitational potential. We therefore identify galax- 
ies brighter than ma « 15.4 within 1.5 Mpc of the center 
of all 41 groups as spectroscopic targets. We include all 
of these galaxies in our spectroscopic survey. Occasion- 
ally, we do not reobscrvc galaxies with a known rcdshift z 
such that \cz — cz\ > 4000 km s~^, where z is the redshift 
of the group as listed in Mahdavi et al. (2000). In other 
words, we sometimes ignore spectroscopically confirmed 
foreground and background galaxies. 

The total area of the sky covered by our spectroscopic 
survey is « 184 square degrees. Thus the photometric cal- 
ibration of plate-derived magnitudes is difficult, and the 
magnitude limit is only an estimate from the Palomar Ob- 
servatory Sky Survey (POSS) photographic plates, with 
a typical scatter of 0.5 mag from system to system. Be- 
cause we view the galaxies as test particles embedded in a 
gravitational potential, completeness is not as important 
as it would be if we wanted to measure the group lumi- 
nosity function. Nevertheless, our sampling uses a strict 
magnitude cutoff in the field of each group, with no biases 
against galaxies close together on the sky. 

We obtained spectra of the 2419 target galaxies with the 
FAST single object spectrograph (Fabricant et al. 1998) 
on the 1.5m Tillinghast Reflector at Whipple Observatory 
on Mount Hopkins, AZ. Using a 300 line mm~^ grating, 
a 3" slit, and a spectral coverage of 3940 A centered at 
5500A, we had a typical resolution of 4A. A data reduction 
pipeline incorporating the cross-correlation of the spectra 
with rest-frame galaxy templates (Tonry & Davis 1979; 
Kurtz et al. 1992; Kurtz & Mink 1998) gave redshifts with 
an average uncertainty of ±40 km s~^. Roughly 20% of 
these data were reported in Mahdavi et al. (1999); for 
completeness, we list all 2419 redshifts in Table 2. 

2.2. Clusters 

To compare the mass distribution of poor systems of 
galaxies and nearby rich clusters, we assemble a catalog of 
clusters close to dynamical equilibrium. This requirement 
is important, because the techniques we use below (and in 
fact all other methods relying on equilibrium dynamics) 
assume that the phase space distribution of the member 
galaxies is constant with time. Of course, at some level 
all clusters have residual substructure; our best hope is to 
examine those systems with the smallest departures from 
equilibrium. 

It is surprising just how rare equilibrium systems are. 
For example, of the 10 brightest objects in the ROSAT 
Bright Cluster Sample (Ebeling et al. 1998, BCS), only 
two clusters, Abell 1795 and Abell 2029, make it to our list 
in Table 3. Our criteria are not stringent: we require that 
published studies of the X-ray emitting gas and the cluster 
galaxy population lack clear evidence of ongoing mergers. 
For example a strong bimodality in the X-ray or optical 
light distribution is a signal that the system is not close 
to dynamical equilibrium. We also require a large number 
(> 70) of measured velocities in the field of each cluster; 
we search the NASA Extragalactic Database (NED) for 



published redshifts within 1.5 Mpc of the cluster center. 

The dynamical state of some of the clusters in our list is 
ambiguous. For example, the central dominant galaxy in 
Abell 2029 has a large peculiar velocity ~ 500 km s~^ 
in the center of mass frame (Oegerlc et al. 1995), but 
the cluster has a regular X-ray and optical morphology 
and shows no statistically significant substructure (Slezak 
et al. 1994; Oegerle et al. 1995). Many of the clusters we 
select contain cooling flows, which should arise only un- 
der equilibrium conditions (Allen 1998). However, not all 
cooling flow clusters have equilibrium galaxy populations; 
the most obvious example is Abell 85, which exhibits quite 
an irregular galaxy velocity structure (Durret et al. 1998). 
Abell 2199, another cooling flow cluster, is regular in both 
the X-ray and optical data within 1.5 Mpc. However, it 
is on course to collide with Abell 2197 and several other 
small galaxy groups located beyond 2 Mpc (Rines et al. 
2001). 

The clusters in our catalog have a mean rcdshift 0.04, 
and therefore complement the clusters of the Canadian 
Network for Observational Cosmology (Yee et al. 1996; 
Carlberg et al. 1996, 1997; van der Marel et al. 2000, 
CNOC), which are at z w 0.3 on the average. The two 
samples differ in character, however: at least six of the 
CNOC clusters exhibit obvious substructure or asymme- 
try in X-ray images (Neumann & Bohringer 1997; Lewis 
et al. 1999), and possibly also in weak lensing maps (Clowe 
et al. 2000). The CNOC mass determinations rely on the 
assumption that equilibrium techniques average over the 
substructure without biasing the final result. To a lesser 
extent, this assumption underlies our study as well; the 
absence of substructure still does not guarantee that the 
galaxy phase space distribution has arrived at a steady 
state. 

3. THE PHASE SPACE DIAGRAM 

In this section we isolate the group and cluster members 
from foreground and background galaxies. To conduct the 
dynamical modeling, we combine the systems into two en- 
sembles. 

3.1. Membership 

A redshift survey in the field of a particular cluster is 
always contaminated by foreground and background galax- 
ies. Some outliers in velocity space are easily identifi- 
able. Unrelated galaxies with velocities close to the clus- 
ter mean, however, are not as cleanly separated. Nor is it 
possible to identify a redshift range (-Zi, -22) which contains 
only the cluster members, because interlopers — unrelated 
galaxies with large peculiar motions — could well be pro- 
jected into this redshift range. 

We adopt a twofold approach to the membership prob- 
lem. Here we exclude the most probable foreground and 
background galaxies by the method of Zabludoff, Huchra, 
& GcUcr (1990). Then in §4.4 we follow van der Marel 
et al. (2000) in folding the existence of the remaining in- 
terlopers directly into our dynamical model. 

The Zabludoff et al. (1990) method consists entirely in 
making sure that in a sorted list of member velocities Vz,i, 
no two neighbors have Avz > cr^, where tr^ is the line- 
of-sight velocity dispersion of the cluster. This approach 
is not the most sophisticated method available. There is 
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Fig. 1. — Velocity histograms of the galaxies observed in the 41 groups. Heavily outlined histograms indicate interlopers identified by the 
membership algorithm (§3.1). Filled, hatched, and empty histograms represent member galaxies within 0.5, 1.0, and 1.5 Mpc of the group 
center. 



Shape of the Matter Density 



5 



an extensive literature on the use of nonparamctric adap- 
tive kernels to reconstruct the cluster velocity distribution 
(Pisani 1993). However, the interpretation of the smooth 
probability distributions produced by the adaptive kernels 
is not straightforward. In particular, the frequent occur- 
rence of slightly double-peaked probability distributions 
makes it difficult to tell exactly which galaxies should be 
considered cluster members. The Zabludoff et al. (1990) 
method is also nonparamctric given a rcdshift range, and 
it reports the cluster membership unambiguously. 

To implement this method, we begin with the known 
mean system velocity cz, from Mahdavi et al. (2000) for 
groups, and from NED for the clusters. We remove all 
galaxies in our survey with \cz — cz\ > 3000 km s~^ for 
each cluster. We then calculate z and ct^, given by 

1 ^ 

i=l 

where Vz,i, the ccntcr-of-mass velocity of each galaxy, is 
shown in equation (3). We sort the Vz,i, and remove all 
velocity-space neighbors with a velocity difference \Avz\ > 
(Tz- We repeat the procedure until no galaxy is rejected. 
Tables 1 and 3 list the final membership counts. This list 
almost certainly contains additional interlopers, which we 
treat probabilistically in §4. A total of 1428 of the 2419 
surveyed galaxies (60%) are group members. Of the 979 
additional redshift taken from the literature, 851 (86%) 
are cluster members. 

3.2. The Ensemble Systems 

The data for any given system are scarce; there are 
on the average ~ 35 members per system in the group 
database. It is therefore not possible to measure the shape 
of the matter density for individual systems. Our approach 
is to combine the groups and clusters into two ensembles. 
If the groups and clusters arc self-similar — i.e., they are 
differently scaled versions of the same archetypal system — 
the aggregate yields the maximum information about the 
shape of the matter distribution. If on the other hand the 
systems in each ensemble are not self-similar, the ensemble 
itself represents an average over the various matter distri- 
butions, and the derived results ought to reflect the range 
of shapes present in each ensemble. 

How are the systems to be combined? Ideally, both the 
velocity and the position of galaxies in each system ought 
to be scaled to a common value. We could choose a scal- 
ing such that the total virial mass of each system as given 
by equation (1) is the same: v = Vz/<7z, and R = R/Rh- 
However, interlopers make the radial scaling problematic. 
In practice, scaling by Rh introduces noise in the data 
because of its sensitivity to interlopers and substructure 
(Mahdavi et al. 1999). A more suitable scaling radius is 
r200) the radius at which the cluster density is 200 times 
the critical density of the universe. 

A common method for calculating r2oo, due to Carlberg 
et al. (1997), is to assume that the mass profile M(r) oc r in 
the vicinity of r2oo- Although observed X-ray and lensing 
cluster mass profiles are not proportional to r everywhere 
(Markevitch et al. 1999; Clowe et al. 2000), the hnear rela- 
tion does appear to be reasonable near r2oo- This assump- 
tion, together with the virial theorem (with Rh replaced 



by r2oo)) yields the following adopted scalings: 

Vz = VzjOz (6) 

= 10^ 

R = R/r2m (8) 
where Hq(z) is the Hubble constant at the redshift of the 
cluster. 

Using the velocity and radius scalings, we combine the 
groups and the clusters into two different ensembles — one 
"high (t" and one "low cr" ensemble. We do not combine 
all the data into one ensemble because lower mass systems 
of galaxies are thought to exhibit "similarity breaking," 
or departures from the scaling laws that pertain to rich 
clusters of galaxies. For example Ponman et al. (1999), 
Hclsdon & Ponman (2000), Mahdavi et al. (2000), and 
Mahdavi (2001) show that for low mass systems of galax- 
ies, the relationships between the X-ray luminosity Lx , X- 
ray temperature T, and velocity dispersion a differs from 
that of rich clusters. It would be interesting to see if cor- 
responding differences in the matter distributions of these 
systems exist. Some high resolution N-body simulation 
have reported differences in the shape of the matter distri- 
bution for low- and high-mass systems ( Jing & Suto 2000) . 

Another reason to break the data into two ensembles is 
completeness. In order for the analysis to be valid, the sys- 
tems assigned to an ensemble need to have redshift surveys 
complete to the same scaled radius. Because of the fixed 
physical limits of our survey (1.5 Mpc), low-cr (less mas- 
sive) groups are complete to a much larger value of r2oo 
than are high-cr (more massive) systems. To make efficient 
use of the data, it is best to have two different ensembles 
with different completeness limits. For this purpose it is 
useful to define N{R\i^) as the total number of galaxies 
within R\ini in all systems that are complete to i?iim- By 
maximizing N{R\ijn) as a function of Rum for each of the 
two ensembles, we ensure that we arc making the best pos- 
sible use of the data while adhering to the completeness 
requirement. 

Our goal is to have an equal number of galaxies in the 
low- and high-cr ensembles. The closest we can come to 
this goal is to split the systems at cr = 470 km s~^. Maxi- 
mizing A'^(i?iim) yields a low-cr ensemble with 893 members 
complete to Rum = 1.8r2oo, and a high-cr ensemble with 
945 members complete to Rum = 0.8r2oo- We therefore 
utilize 1838 (80%) of the 2279 available member galaxy ve- 
locities in our data sample. The two ensembles are shown 
in Figure 2. 

3.3. Statistical Properties of the EnsemMes 

Table 5 shows the statistical properties of the final dy- 
namical sample. We use the two sample KS test (Press 
et al. 1992) to evaluate the consistency of the high- and 
low-(T ensembles. The KS tests show that the distributions 
of scaled velocities Vz in the two ensembles arc consistent 
with each other. This is illustrated in the velocity his- 
tograms in Figure 3. 

On the other hand, the distribution of scaled distances 
Ri differs in the two samples. According to the KS test, 
the probability that the distributions are the same is less 
than 10~^. Note that for this comparison only, we trun- 
cated the low-cr sample at .R = 0.8, because that is the 
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hmit to which the high-cr sample is complete. Figure 3 
also shows the R histogram. The distribution of the dis- 
tances of high-(T galaxies varies more rapidly with R than 
that of the low-cr sample. 

Thus, although the galaxies in the high- and low-cr en- 
sembles appear to have similar velocity distributions, their 
spatial distributions differ slightly. In the next section we 
investigate whether whether this difference translates into 
a difference in the total matter distribution. 

4. DYNAMICS 

Here we discuss our technique for constraining the grav- 
itational potential and orbital distribution of a system of 
galaxies. We use elements of Cuddeford (1991), Mcrritt & 
Saha (1993) and van der Marel et al. (2000) in our analy- 
sis. We add our own procedures for the faster computation 
of the phase space integrals described below. 

4.1. Measurement Technique 

Our goal is to use measurements of galaxy positions and 
radial velocities (R, Vz) to constrain the shape of the mat- 
ter density. Our approach is Baycsian. For practical pur- 
poses, "Bayesian" means that our final result is a probabil- 
ity density rather a set of numbers and associated errors. 
For example, in addition to saying "the mass is M ± AM 
and the slope is n ± An," wc draw curves of equal prob- 
ability in M — n space. These "confidence regions" will 
not necessarily be elliptical, as would be the case for a 
straightforward linearized fit. 

Baycs's theorem specifics the method of calculating 
these confidence regions. Let the vector a represent all 
the parameters we wish to derive from the data. We seek 
the probability of a given the data, which, according to 
Bayes's theorem, is^ 

( ) ^ 

p{a\R,v.) = -^^l[p{Ri,v,,i\a) (9) 

The function on the left is the joint probability distribution 
of the unknown parameters a, such that / p{a.\R,Vz)da = 
1. On the right hand side: 

1. p(R, Vz) is independent of a. Because the left hand 
side has to be normalized to unity, this probability 
is a precisely calculable number. 

2. p(a) is the so-called Bayesian prior; it incorporates 
our prior expectations regarding the value of a. The 
priors are detailed in §4.5 below. 

3. p{Ri,Vz.i\3)dRdvz is the probability, given a physi- 
cal model a, of observing a galaxy with a projected 
distance from the center between i?j and Ri + dR, 
and with a line-of-sight velocity between Vz^i and 
Vz.i + dvz- 

Through Bayes's theorem the desired confidence regions 
can be derived by calculating the two-dimensional func- 
tion p{R,Vz\sl) for all a of interest, multiplying the result 
by the prior, and normalizing the answer to unity. 

To do this calculation we use the galaxy phase-space 
distribution function (DF hereafter). The DF, a general 

* For simplicity, we write the equations in this section in terms of R and 



way to describe the dynamical state of the galaxies in a 
cluster, is sometimes written /(r,v), where r and v are 
the position and velocity vectors, such that /d'^rd^v is 
the number of galaxies within a small phase space volume 
clement d'^r d^v at (r, v). If we knew the DF as a fimction 
of our parameter set a, it would be simple to write down 
the probability of observing a galaxy with line-of-sight ve- 
locity Vz at a projected distance R from the cluster center. 
We begin by referring to this probability as p'{R, Vz\a). 

Suppose we conduct a complete redshift survey of an 
isolated spherical cluster complete to a limiting projected 
distance Rum from the center. Then the probability of 
observing a given galaxy with radius R and velocity Vz is 

p {R,Vz\ a) = — — X 

where the R and (p denote polar coordinates in the plane 
of the sky, and z is the line of sight. The normalization 
A^o is the total number of galaxies within the limits of the 
survey, such that 

/ / p'{R,Vz\ai)dRdVz = l. (11) 

Jo J-oo 

The chief difficulty lies in calculating the DF as a func- 
tion of the dark matter potential and the galaxy orbital 
distribution. In other words, we must find the functions 
aL{R,z,VR,v^,Vz)- The probability represents an ide- 
alized observation of a cluster in the absence of contami- 
nants ("interlopers"); the true probability p, which takes 
interlopers into account, is calculated below (§4.4). 

We can begin by embedding this DF into our spher- 
ical gravitational potential $(r). Because $ is a neg- 
ative quantity by convention, it is helpful to redefine 
^ = $(oo) — <i>(r). Neglecting encounters, we assume that 
the only force on a galaxy is the Newtownian gravitational 
interaction v = V\I/. If we could write the DF in terms of 
the acceleration v and the potential ^, we would have the 
required link between the phase space variables and the 
gravitational potential. Such an equation would connect 
the observations to the matter distribution. 

The Jeans theorem (Binney & Tremaine 1987) provides 
this crucial link. The theorem guarantees that in a steady- 
state spherical system the DF is a function of only two in- 
tegrals of motion: the negative energy £ = — where 
= V • V, and the square of the angular momentum L^. 
Thus we can write 



f = f{£,L^)=f[^{r)-v^/2,\rxv\- 



(12) 



Newton's law and the Jeans theorem provide all the in- 
formation we need to calculate the probability of observ- 
ing a data point {R,Vz) given a DF and potential {f,'^) 
through equations (10)-(12). The Appendix contains the 
full expression for p'{R,Vz\sl) in spherical coordinates. 

Once we calculate the probability of the data given a 
specific model, wc can use Bayes's theorem to calculate 
constraints on the set of model parameters. 

Vz, rather than R and Vz- 
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Fig. 3. — Comparison of the low-cr sample (solid line) with the the high-tr sample (dashed line). Shown are the grand total velocity 
histogram (left) and the distribution of distances from the cluster center {right). For the latter plot, the low-cr systems have been clipped at 
0.8r200- 



4.2. Mass Model 

To begin we consider the parameters of the mass pro- 
file. Many forms exist in the literature, but the ideal pro- 
file for our purposes should fulfill the following criteria, 
in decreasing order of importance: (1) It should be gen- 
eral enough to have the inner matter density slope as a 
free parameter: (2) it should include well-studied profiles 
as special cases; and (3) it should generate a density and 
gravitational potential in terms of elementary functions. 
This last property would make phase space distribution 
analysis faster and more straightforward. 

We use a mass profile that fulfills all the three criteria: 

3— n 



M(r) = Mo 



(13) 



where Mq is the total mass of the system, ro is the char- 
acteristic radius, and n is the inner slope of the mat- 
ter density. This model is sometimes referred to as the 
"gamma-model" (not to be confused with 7 below) in 
stellar-dynamical literature (Dehnen 1993; Tremaine et al. 
1994). The density and potential pair generated by this 
mass profile are 



p{r) = 



Moro(3-n) 



$(r) = - 



47r 
GMq 
ro(2-n 



r-^{ro+ry 



1 - 



r + ro 



(14) 



(15) 



The above profile includes the well-known Hernquist 
(1990) and Jaffe (1983) models, which are generated by 
the n = 1 and n = 2 cases, respectively. The Hernquist 
profile is known to match the light and matter distribution 
in elliptical galaxies as well as in groups and clusters (Mah- 
davi et al. 1999; Rines et al. 2001). The formulation above 
is a generalization with arbitrary inner slope n. Note that 
the NFW model pnfw ocr~^(l-|-r)~^ has infinite mass 



and is not a subset of the models we consider. Using a 
method completely different from ours, Rines et al. (2003) 
find that samples as large 10** galaxies cannot distinguish 
between the NFW and Hernquist mass distributions in the 
infall regions of clusters (i? > 2 Mpc) . 

As a further boon, the generalized potential is invertible 
analytically: 



r($) 



l-[l + {n- 2)$ro/GMo]V(2-n) ' 



(16) 



further simplifying the calculation of phase space densities 
described below. 

In summary, we begin with a three dimensional parame- 
ter set a, containing the inner slope, the transition radius, 
and the normalization of the matter density. Three more 
parameters not directly related to the matter density are 
also required. They are the slope of the galaxy density at 
infinity, 7, the velocity anisotropy, /?, and the interloper 
fraction, Pj. 

4.3. DF model 

Next we examine the form for the DF, f{£,L'^). It 
is virtually a guarantee that given a potential ^E", many 
quite different / will describe the same data equally well. 
We therefore use a very simple DF — a scale free, separa- 
ble function of the energy and angular momentum (Fricke 
1952): 

f{£,L-') = h£''-^I^L-^^. (17) 

As we show below, this well-studied DF adequately de- 
scribes the cluster and group data. 

This DF yields a unique galaxy density profile v (dis- 
tinct from the matter density profile p). To calculate a 
galaxy density profile, we note that the integral of this 
particular DF over all velocities gives the particle space 
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density; see the Appendix for a detailed calculation of the 
integral: 



v{r)= [ f{£,L')d\ 



(18) 



cxr-2/3,I,(j.)"-/3+i. (19) 

This (3 parameter is equal to the orbital anisotropy of 
the galaxies: 



(20) 



In other words, ^1 — fi is the ratio of the velocity disper- 
sion in the tangential direction to the velocity dispersion 
in the radial direction. From both the equations above it 
is evident that f3 < 1. If that were not the case, (1) the 

galaxy mass J r'^i>{r)dr would diverge at r = 0, and (2) 
one of the velocity dispersions (T( and cr,. would have to be 
imaginary. 

The a and (3 parameters also describe the slope of the 
galaxy density at infinity. For example, if ^{r) oc 1/r at 
large radii, then u{r) oc , with 

7 = a + /3+l. (21) 

Henceforth, without loss of generality, we will discuss the 
energy slope of the DF solely in terms of 7. As 7 in- 
creases, the cluster contains fewer galaxies with large ki- 
netic energies. This behavior occurs because galaxies with 
energy £ ^ <^ begin to outnumber galaxies with f w 
(see equation 12). As a result the galaxy distribution is 
less extended than it would be with small 7. The mini- 
mum acceptable value is 7 = 4, because the mass models 
we consider (equation 14) decline asymptotically as r^^. 

The inner slope of the galaxy density rig can be different 
from the slope of the matter density n: 



2(3 

'^"-'^ 2(3 +{n- 



(n < 2) 
2)(7-2/J) (n>2) 



(22) 



The reason for the split at n = 2 is that the potential <f> 
becomes singular for n > 2, and through equation (19), 
gives a contribution to the inner slope of the galaxy den- 
sity. For n < 2, the potential is constant at r = 0, and 
there is no such contribution. A corollary of this property 
is the requirement 

n > Ug. (23) 

This statement specifies that the galaxy density may 
never exceed the total matter density. It implies, impor- 
tantly, that for constant density cores n = 0, no radially 
anisotropic models (/? > 0) are viable. This requirement 
is not unique to our model; a large number of other radi- 
ally anisotropic models, with substantially different DFs 
and anisotropy profiles P{r), are also incompatible with 
constant-density cores (van der Marel et al. 2000; Cudde- 
ford 1991). 

Up to this point we have a five-dimensional parameter 
space a = (Mo,n, ro,7,/3). One final dimension is neces- 
sary, the interloper fraction. 

4.4. Interlopers 

So far we have assumed that the clusters and groups 
we observe exist in isolation, free from any contamina- 
tion from background and foreground galaxies. In reality, 
however, even after velocity clipping (see §3.1), a small 



percentage of the galaxies arc unrelated objects projected 
into the phase space volume of interest. We wish to es- 
timate this percentage — the interloper fraction Pj — from 
the data. 

Wc follow the elegant method of van der Marel et al. 
(2000) in treating the interlopers statistically. This 
method assumes that interlopers land randomly in the 
group or cluster phase space volume. Thus every galaxy 
we observe has a finite probability Pj of being an unrelated 
interloper. The probability that a galaxy with velocity 
and projected distance R is included in our survey is then 

p{Ri,v,,i\sL) = (1 - Pi)p'{Ri,v,,i\ai) + (24) 

Where |wiim| is the maximum observed line of sight veloc- 
ity. It is easy to verify that the above expression meets 
a normalization criterion similar to equation (11) as long 
p'{R,Vz\a) ^ as ^ \viim\- 
f-Riim r\vii„i\ 

/ p{R,V;,\a)dRdv^ = l. (25) 

We can now write down the final expression for the six- 
dimensional function p(a|R, v^). The full expression is too 
long to be listed here and is derived in detail within the 
Appendix. The derivation consists of a straightforward if 
laborious combination of equations (10) - (24). The main 
complications occur because wc observe a spherical object 
in cylindrical coordinates. 

We henceforth refer to the joint probability distribution 
of observables for a given measurement simply as p{R, Vz), 
where without loss of generality we have made the substi- 
tution R R/r2oo = R and v^/a^ = v^- 

4.5. Reparam,eterization and Bayesian Priors 

The six parameters of our model are (n,Mo,7,ro,/3,P/). 
Some of these parameters can be redefined without loss of 
generality to yield more compact joint confidence regions. 
For example, we find that jointly constraining n, Mq and 
ro as defined in equation (13) yields a high degree of corre- 
lation among all three parameters. However, by switching 
to the system 

r5/2 = (26) 

M200 = M(r2oo), (27) 
we significantly reduce the overall extent of the confidence 
regions. The new radius scaling r5/2 is defined such that 



dlnp 



d\nr 



(28) 



In other words, r,^/2 is the location at which the slope of 
the matter density is 5/2. Similarly, M200 is the total mass 
contained within r2oo- 

The fact that we analyze ensembles of clusters has impli- 



cations for the measurement of 



•5/2 



and M200- To create 



the ensembles, it was necessary to scale the radii and line- 
of-sight velocities by r2oo and a, respectively (see §3.2). 
Because of this scaling, the mass and characteristic radius 
of the ensembles are normalized as well. As a result, our 
method constrains the dimensionless quantity 

M200 = ^ (29) 
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for each of the two the ensembles. Similarly, the quantity 

^5/2 = ■r5/2/r-200, (30) 
rather than r-^j-i alone, is constrained by our method. 

A redefinition of the anisotropy is also appropriate. Be- 
cause the allowable range in /3 is [— oo,l], we define, simi- 
larly to Wilkinson et al. (2002), 

/3=- In (1-/3). (31) 
This redefinition maps the allowable range from [— oo,l] 
into [—00,00], and ensures that the parameter space has 
equal prior probability density in the radially and tangen- 
tially anisotropic regimes. 

The final parameter set is (n,M20o/5/2;7) P,Pi)- In our 
analysis, we take the Bayesian priors p(a) (equation 9) for 
these parameters to be uniform. 

4.6. Computation Technique 

Here we describe the numerical procedure used to calcu- 
late our probabilities. Even though ]3(i?, v^) has six dimen- 
sions, wc limit ourselves to constraining five at any given 
time. As we explain in §5.2, for free n our data can place 
only a lower, and not an upper limit on tlic^ energy slope 
7, and therefore p(i?, Vz) cannot be normalized properly 
as a function of 7. On the other hand, if we fix n, then 
we can easily constrain 7. For this reason, we conduct the 
entire procedure described below several times: (a) once 
with n fixed and 7 free (§5.1), and (b) many times with n 
free and 7 ranging over several discrete values (§5.2). 

We use a computation technique that yields an estimate 
of p{R,Vz) on an grid with elements: 

1. Tabulation. There are two scaling parameters, the 
transition radius r5/2 and the mass normalization 
M200- We define two dimensionless parameters, 
R' = R/r5/2, and v'^ = i^^ro / (G A'hoa) ■ Then wc 
tabulate p(i?',ti'|a) on a refined M x M grid for 
different values of (3, and n. This step therefore re- 
quires m'^M'^ evaluations of the triple integral given 
in equation (10). This phase occupies roughly 25% 
of total computation time. 

2. Interpolation. For each of the A^o data points 
{Ri.,Vz,i), we use cubic spline interpolation to calcu- 
late the probability density as a function of /3, M200, 
r5 /2 , and either n or 7. This step requires Nom^ in- 
terpolations, and takes up roughly 70% of the total 
computation time. Because of the linearity of the 
interloper parameter in equation (24), calculating 
the probability density as a function of Pi takes up 
a trivial amount of time in comparison to the Noiri^ 
interpolations. 

3. Projection. To display the 5-dimensional probabil- 
ity density, we produce two-dimensional realizations 
of it. This means creating 5x4/2 =10 separate 
probability distributions, each representing a sum- 
ming (or "marginalization" ) of p(a|R, v^) along a 
cube orthogonal to the other 9. For example, to 
obtain the joint probability distribution of the mass 
normalization M200 and inner slope n, we use 

p{M200, n, f5/2, 0, Pi)df 5/2 dpdPi. (32) 



This step is conducted simultaneously with step (2) 
and takes up a negligible amount of the total com- 
putation time. 

4. Goodness of fit. To check the goodness-of-fit of the 
physical model, we bin the data in two dimensions 
and use tests to compare the number of galaxies 
in each bin to the number predicted by the most 
likely model. We bin the data only to check the 
goodness of fit, not to derive constraints on the pa- 
rameters. We calculate the quantity 



(33) 



where k is the total number of bins, Nj is the num- 
ber of galaxies observed in bin j, and pj is the in- 
tegrated probability of the best fit model in that 
bin. The quantity in the denominator is the Pois- 
son variance in each bin. 

It can be shown that for our case is bounded 
above by a Xfe-i distribution and below by a xt-s-i 
distribution (Lupton 1993), where s = 5 is the num- 
ber of parameters we fit. Therefore, if 1 — g is 
confidence with which we can reject the hypothe- 
sis that the data is consistent without our model, 
then q must lie between T[{k — s — l)/2, A^/2] and 
r[(fc - l)/2, A'^/2], where r[a, a;] is the incomplete 
gamma function. We always use the value that 
yields the smaller q. 

We create rectangular bins in R,Vz space so that 
they each contain 30 galaxies and so that J2Pj — I- 
Let t be the largest integer smaller than or equal to 
^/NqJSO. There are t strips in the radial direction, 
each with 3Qt galaxies. The first t—1 radial strips 
are divided into t bins each containing 30 galaxies; 
the last radial strip is divided into t—1 bins each 
containing 30 galaxies, and a final bin containing 
the remaining galaxies. Therefore, the total num- 
ber of bins is k = t^. 

For our final calculations, we use M = 50 and m = 40. 

4.7. Simulations 

The ideal procedure to test the validity of our technique 
would be to create thousands of Monte Carlo realizations 
of various clusters with different values of /?, 7, ro, M200, 
and Pi, and to conduct our analysis separately for each 
realization. Then we could compute the fraction of simu- 
lations in which the most likely parameters closely match 
the input parameters. Unfortunately, such a test would be 
prohibitively expensive, requiring many months of CPU 
time. 

As a compromise, we conduct four tests in which we 
attempt to model simulated ensembles. In these simula- 
tions, we draw 893 members from the p{R,Vz) for a set 
of parameters given a priori. The procedure for draw- 
ing deviates distributed according to p{R, v^) is the rejec- 
tion method described in Press et al. (1992). The rejec- 
tion method uses a comparison function that is everywhere 
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greater than p{R,Vz) and has an analytic cumulative dis- 
tribution. A large number of deviates are drawn from the 
comparison function, and those that lie beneath p{R,Vz) 
have; the desired probability distribution. Wc extend the 
Press ct al. (1992) technique to two dimensions by using a 
constant comparison function in R,Vz- Finally, we maxi- 
mize the Bayesian likelihood by conducting the entire pro- 
cedure described in the previous section. The simulated 
data are drawn from models with 7 = 8. We first analyze 
the simulations assuming 7 = 8, and then we consider the 
effects of varying 7. 

Table 4 lists the parameters used for the simulations, 
and Figure 4 shows their properties. The phase space 
distribution generated from each parameter set appears 
together with the grand total velocity histogram, 

N{vz)= r^'^ p{R,dz)dR (34) 
Jo 

and the surface number density 

/oo 
p{R,Vz)dVz/i2TrR), (35) 
-00 

where Nq is the total number of galaxies in the sample. 
In these figures we also show the most likely models for 
the above two observables. Note that these curves are not 
fits, but predictions of the unbinncd maximum likelihood 
technique described in the previous section. 

According to the simulations, our estimation technique 
is able to tell apart fundamentally different models with 
similar binned statistics. For example, simulation A and 
D have similar N{vz) and S(Ji), even though A was gener- 
ated with n = 1 and D was generated with n = 0. Because 
we analyze unbinned velocities and radii, our technique 
yields a 95% confidence region n =0.6 1.3 for A and n =0 
0.7 for D. The unbinned estimator is able to distinguish 
the parent distributions of the two data sets. 

Another feature of the simulations is that they are more 
powerful when n is steep — that is, the slope of the matter 
density is large. In this case our method yields a fairly 
accurate measurement of n. When n is close to — that 
is, the matter density has a flat core — then the measure- 
ment method yields a wider range in n. Nevertheless, this 
range for both models C and D includes the correct n = 
solution. 

We can also use the simulations to test the validity of 
our numerical integration. The integrals of N{vz) and 
2itRTj{R)/No should equal 1. For the four simulations, 
we find that / N{iz) = 0.9976,0.9880,0.9998,0.9987, and 
/ 2TrRE{R)/No = 0.9987,0.9972,1.0001,0.9995. Thus in 
most cases our accuracy is better than 0.5%. 

Finally, it is useful to examine how sensitive the mea- 
surement technique is to variations in 7. In analyzing our 
simulations, we have set 7 = 8, i.e., we have fixed it at 
the value used to generate the data. Now we reanalyze 
the data assuming different values of 7, from 4 to 12. We 
rerun our analysis procedure, calculating the quality of fit 
q and the 95% marginalized confidence region in n as a 
function of 7. The results are shown in Figure 5. We find 
that as long as g > 0.1, i.e., the fit is statistically accept- 
able, then the derived confidence regions ngt are close to 
the true n for most values of 7. In general, choosing the 
smallest value of 7 that gives q > 0.1 results in the most 



accurate confidence intervals on n. Choosing larger values 
of 7 yields confidence regions nat that are slightly different 
from the correct values. 

While not exhaustive, these simulations suggest that our 
measurement technique is triistworthy. The 95% confi- 
dence contours derived using our method should include 
the correct value of each parameter if the general model is 
admissible (i.e., q > 0.1), with the caveat that the small- 
est value of 7 that yields a good fit should be used. These 
confidence regions ought to be compact for large values of 
the density slope n, and broad- valued for flat halos with 
small n. 

5. RESULTS 

Here we describe our constraints on the inner matter 

density slope n, the mass normalization M200, the transi- 
tion radius f5/2, the velocity anisotropy /3, and the inter- 
loper fraction P/. It bears repeating that our constraints 
result from maximization of the likelihood function de- 
scribed above, using the full unbinned data set. The 
binned profiles we show are for illustrative purposes and 
do not indicate ordinary fitting. 

5.1. Models with Constant- Density Cores 

First we investigate the possibility that the ensembles 
have constant-density cores (n = 0). In §4.3 we showed 
that such cores do not support radially anisotropic mod- 
els; we are limited to /? < 0. Therefore the minimization 
occurs over five dimensions: (/?,7,f5/2,Af200i-P/)- 

The dotted line in figure 6 shows the results of the min- 
imization. We show the most likely grand total velocity 
histogram and galaxy surface density as given by equations 
(34)-(35). 

We find that matter profiles with constant-density cores 
do not produce a galaxy density profile as steep as our 
data. Although the grand-total velocity histogram N{vz) 
seems accurately reproduced, the model's galaxy density 
in the innermost regions is too small. The constant den- 
sity models are disfavored with q = 0.003 for the low-cr 
ensemble, and 10~^ for the high-cr ensemble. 

Note that the inconsistency with the galaxy surface den- 
sity S is not the only discrepancy bctwec;n the data and 
the n = model. Our goodness-of-fit method described 
above shows that the n = models predict too many high- 
velocity galaxies at intermediate radii. This discrepancy is 
not easily visuahzed through the plots of N{vz) or T,{R). 

Just to be sure that the n = models are a poor fit, 
we also try relaxing the requirement that the orbits not 
be radially anisotropic. We fit the full range of negative 
to positive /3, and find that /? w 0.25,7 ~ 4 maximize the 
likelihood for both ensembles. These values are formally 
unphysical, yielding a galaxy density that is steeper than 
the n = total matter density near the central regions. 
However, even these fits are rejected with q = 0.02 for 
groups and q = 0.003 for clusters. 

It is interesting to note (Figure 6) that the inconsistency 
occurs near r = 0, and not at larger radii. If our choice of 
mass model (equation 14) were the cause for the poor fit, 
we would expect the inconsistency to occur at the larger 
radii, because that is where we impose a p oc behav- 
ior. We conclude that models with a constant density core 
(n = 0) are at best barely consistent with the data. 
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Fig. 4. — Simulations of the technique: (top) phase space distributions generated using the four parameter sets in Table 4; (middle) the grand 
total velocity histogram N{vz); (bottom) the galaxy surface density. The solid lines show predictions (not fits) of the maximum-likelihood 
analysis of unbinned distance and velocity data. The "breaks" in S(i?) around R ~ r200 in simulations A and B are due to contamination by 
the interloper population Pj. 
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Fig. 5.- Testing the robustness of the simulations as a function of 7: {top) the 95% confidence regions on the best-fit n minus the true n 
for each simulation; (bottom) the quality of fit as a function of 7. 
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Fig. 6. — Predictions (not fits) of the maximum-likeUhood analysis of unbinned distance and velocity data. Shown are the surface number 
density (top) and the grand total velocity histogram (bottom). The solid line represents the most likely model overall, whereas the dashed 
line shows the most likely model when the inner slope of the matter density is forced to equal 0. The latter model is rejected with 99.7% or 
better confidence. The dotted vertical line represent the completeness limit of the high-u ensemble. 
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5.2. Models with Fixed 7 

Now we allow the inner slope of the mass density n to 
vary, maximizing p(R,Vz\a.) for several different values of 
7, the slope of the galaxy density at large radii. The jointly 
constrained parameter set is therefore {n,f3,r^/2, M2oo,Pi)- 
As described in §4.3, 7 is also related to the slope of the en- 
ergy term in the distribution function (equation 12). The 
larger the value of 7, the more peaked the distribution of 
galaxies with small kinetic energies. 

Our data can provide a lower, but not an upper limit on 
7. For both the low- and the high-c ensembles, the qual- 
ity of the fit increases with 7, asymptotically approaching 
a fixed value as 7 — > 00. Our simulations (§4.7) show a 
similar effect, and suggest that the smallest value of 7 that 
gives an acceptable fit is likely to give the most accurate 
results. Therefore, instead of constraining 7 continuously, 
we list the minimum value of 7 that yields a quality of 
fit q > 0.1. This constraint corresponds to 7 = 8 for the 
low-cr and 7 = 12 for the high-o" ensemble. 

Figures 6 and 7 show the results. Aside from difference 
in the value of 7, the high- and low- velocity dispersion 
systems provide very similar fits: the galaxy orbits are 
consistent with isotropic (/3 = 0), and both ensembles have 
n « 2 and transition radius r5/2 much greater than r2oo- 
In other words, the matter distributions for both samples 
are consistent with a singular isothermal sphere within the 
limits of the survey. 

5.3. Discussion 

Neither the cluster nor the group data supports a total 
matter distribution with a constant-density core (n = 0). 
This result is in keeping with recent studies of more dis- 
tant clusters in the X-ray (Lewis et al. 2003; Arabadjis 
et al. 2002; Allen et al. 2002) as well as weak lensing maps 
(Gavazzi et al. 2003; Clowe & Schneider 2002; Clowe et al. 

2000) . Optical observations of cluster velocity data have 
also yielded similar results. For example, Biviano & Gi- 
rardi (2003) uses the Jeans equation and isotropic velocity 
dispersion profiles to rule out constant-density cores in the 
Two Degree Field Galaxy Redshift Survey (CoUess et al. 

2001) . 

Our results suggest that within a projected region R < 
2?"200) the matter distribution in both low- and high-cr sys- 
tems of galaxies is close to a single power law with n = 2. 
The density declines rapidly only outside 2r2oo- This re- 
sult is consistent with N-body simulations. For example, in 
the original paper describing the sinmlations of NFW, the 
density profiles of dark matter halos are consistent with a 
single power law of slope 2 between 0.1r2oo and 2r2oo (see 
their Figure 3). The only discrepancy, or flattening, oc- 
curs at radii smaller than 0.1r2o0) where our datasets have 
56 (low-a) and 131 (high-cr) members, perhaps insufficient 
to provide a robust constraint. 

A major limitation of our method is the fixed form of 
the galaxy distribution function /(£, L^). By modeling the 
DF as a power law in energy and angular momentum, we 
neglect all other possible forms for the DF, some of which 
may indeed be consistent with an n = model. On the 
other hand, the tests we perform reassure us that the 
data are at least statistically consistent with cuspy matter 
distributions. 

The work of van der Marel et al. (2000), whose methods 



we adapt and extend, involves a similar analysis, and it is 
instructive to compare the two different approaches. In- 
stead of fixing the form of the DF, they fix the form of the 
surface number density S(i?), and calculate the DF using 
an inversion of the integral in equation (18). As a result, 
their DF is more general than ours, / oc g{£)L~^^, with 
g being the result of the inversion. Their method has the 
advantage of generality, but the disadvantage that it re- 
quires a two-stage fit: first S(i?) must be fit and fixed, and 
then f3 and n measiircd separately, without an indication 
of how changes in the S(-R) fit would affect the resulting 
constraints. Our method sacrifices generality by setting 
g{£) = £°'~^^'^, but gains the advantage that all but one 
of the parameters are fit simultaneously (7 is varied dis- 
cretely). As a result, the correlation among the parameters 
is easy to understand via marginalized confidence contours 
(Figure 7). 

Another difference between our work and van der 
Marel et al. (2000) is the sampling of the parameter 
space. We constrain the continuous five-dimensional re- 
gion (n, (3, Mo, ro, Pf), whereas van der Marel et al. (2000) 
sample discrete points within that space. They find that 
an NFW profile with n ~ 1 matches the CNOC data, but 
they do not consider n = 2 models; we find that our clus- 
ters and groups have steeper matter distributions, with 
the best fit model closer to n = 2. 

The fixed functional form of our DF can also explain 
the large transition radii. Careful examination of equa- 
tion (19) shows that increasing 7, the slope of the galaxy 
density as r — > 00, also steepens the slope of the galaxy 
density Ug near r = 0. It is still the case that Ug = 2(3 
at r = 0, but the derivative of the slope dug/dr becomes 
larger and larger as 7 is increased. Our data have a steep 
galaxy density slope near r = 0, and there are only two 
ways to fit this: increasing /3 (and thus generating more 
radially anisotropic orbits) or increasing 7 (and thus steep- 
ening the overall galaxy energy distribution and hence the 
galaxy density). This is where the velocity distribution 
enters. As Figure 6 shows, the grand total velocity his- 
togram is not strongly peaked a,t Vz = (in fact, it is 
consistent with a Gaussian distribution) . Dramatically in- 
creasing P would yield a velocity distribution that is much 
more sharply peaked than the data. Thus the only choice 
left the model is to increase 7 to larger values, > 8. How- 
ever, the slope of the galaxy density is not as steep as 8 at 
the limits of the survey. Hence, a large transition radius 
ro is required to fit the outer regions of the galaxy number 
density profile. At the same time, the velocity data al- 
low the matter profile to be nearly a power law within the 
region constrained by the data. Note that the upper lim- 
its on r^^2 are possible because if becomes too large, 
then the galaxy density profile begins to resemble a single 
power law as well, and the position data do not favor this 
limit (Figure 6). 

Another shortcoming is that our method constrains the 
total matter density, rather than the dark matter density 
alone. While the dark component is thought to dominate 
the mass, it does not do so overwhelmingly — in typical 
clusters, ^ 25% of the mass is in baryons, chiefly in the 
form of the X-ray emitting medium. In the innermost 
regions of clusters with a dominant central galaxy, it is 
actually the stellar mass density and not the dark mat- 
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Flc;. 7. — Results of the likelihood maximization for free n. Solid contours represent the low-cr ensemble and dashed contours represent the 
high-cr ensemble. Shown are marginalized joint 68% and 95% probability contours in the matter density slope n, normalized anisotropy /3, 
mass normalization M20O) transition radius ^'^d interloper fraction Pj. 
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ter density that dominates the total distribution (Koop- 
mans & Treu 2003). Lensing and stellar dynamical models 
which take into account the starlight separately from the 
dark component show conflicting results — while the cen- 
tral regions of some clusters exhibit flat n = (Sand ct al. 
2002) dark matter densities, others show evidence for steep 
n 2 dark matter profiles (Davis et al. 2003). 

It would be of great value to constrain the inner den- 
sity slope n independently of the slope at infinity. Unfor- 
tunately, because the data are sparse, we cannot derive 
true constraints on the outer galaxy density slope 7, and 
measuring the outer total matter density would be even 
more difficult. With 10^ redshifts a true, simultaneous 
constraint on the outer and inner slopes would be possi- 
ble; such large data sets could allow a direct measurement 
of the phase space distribution function (Merritt 1993), 
without requiring us to parameterize the shape of the dis- 
tribution function, as in equation (12). Data sets that are 
only factors of two larger are unlikely to make that differ- 
ence. Also, leveraging independent constraints on the in- 
ner and outer slopes by extending the survey beyond r2oo 
is unlikely to provide better constraints. In these outer 
regions, most of the galaxies arc experiencing spherical in- 
fall, which makes equilibrium models clearly unapplicable, 
and caustic modeling techniques preferable (van Haarlem 
et al. 1993; Diaferio 1999; Geller et al. 1999; Rines et al. 
2000). However, one can envision a technique that com- 
bines equilibrium dynamics at small scales with the study 
of the infall caustics outside the virialized regions of the 
cluster. In this way one could obtain a total mass profile 
over several decades in r. 

The most promising method of increasing the accuracy 
of this analysis is to extend the known membership of 
groups and clusters to dwarf galaxies with absolute mag- 
nitudes as faint as Mr = — 11. A typical group with ve- 
locity dispersion a w 300 km s"""^ is likely to have » 60 
faint dwarf members within 1 Mpc (Trentham & TuUy 
2002). A large spectroscopic survey of well-selected galax- 
ies brighter than rriR = 22 in our group sample would 
likely yield triple the current membership count and allow 
for much more robust constraints. 

6. CONCLUSION 

We compare the distribution of matter in rich and poor 
systems of galaxies, basing our analysis on a deep catalog 
of galaxies in groups. The catalog consists of a statistically 
complete sample of 2419 newly measured redshifts in the 



fields of 41 systems, as well as 979 redshifts from the lit- 
erature for 8 nearby rich clusters of galaxies. Most of the 
groups have X-ray emission detected in the RASSCALS 
survey (Mahdavi et al. 2000). 

We construct a low-cr ensemble group and a high-cr en- 
semble cluster from these data. The final dynamical sam- 
ples have 893 members in 33 groups, and 945 members in 
15 clusters, with an average velocity dispersion of 330 km 
s~^ and 800 km s~^ respectively. 

Our study is the first to derive confidence volumes in the 
parameter space defined by the matter density slope n, the 
interloper fraction P/, the mass normalization M200, the 
velocity anisotropy /?, and the characteristic radius tq for 
clusters and groups of galaxies. By modeling the phase 
space density of the galaxies directly, we place much more 
stringent bounds on the mass and orbital structure of these 
systems than would be possible through the virial theorem 
or the Jeans equations. However, our results are limited 
by the fact that we force the galaxy distribution function 
(DF) to be a power law in energy and angular momentum, 
neglecting all other forms. 

We find that neither the clusters nor the groups are con- 
sistent with a matter distribution that has a fiat, constant- 
density core (n = 0). Instead, within R < 2r200) both the 
low-cr and the high-cr systems are consistent with a single 
power law with slope n « 2. The models require both 
the matter and the galaxy distribution to decline steeply 
outside 2 'r2oa- It is conceivable that choices of DF dif- 
ferent from ours could yield substantially different results; 
however, the test cannot reject the possibility that our 
model is correct. Our results are consistent with cold CDM 
simulations that predict centrally divergent matter distri- 
butions. 
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48 


26 


7345+50 


265+30 


42.44+0.17 




SS2b239 


13:49:10.8 


-07:18:11.7 


69 


24 


7329+59 


311+29 


42.38+0.26 


HCG 67 


NRGb302 


14:28:33.1 


+ 11:22:07.7 


53 


31 


7892+58 


324+32 


42.05+0.59 




NRGs317^ 


14:47:09.8 


+ 13:42:23.4 


39 


19 


8898+72 


338+38 


42.51+0.30 




SRGbOOQ" 


22:14:48.0 


+ 13:50:17.5 


46 


35 


7775+57 


349+33 


42.44+0.26 




SRGbOia*^ 


22:50:00.7 


+ 11:40:15.6 


55 


30 


7689+101 


578+76 


42.45+0.21 




SRGbOie 


22:58:14.2 


+25:56:13.0 


58 


43 


7394+46 


327+33 


< 41.84 




SRGb037 


23:28:46.6 


+03:30:49.1 


85 


20 


5221+86 


413+59 


41.91+0.44 




SS2b312 


23:47:24.0 


-02:19:08.4 


59 


27 


6773+47 


274+42 


42.41+0.20 


HCG 97 



Note. — A'obs is the number of galaxies who redshifts were measured in this work; A''momb is the number we identify 
as group members; z is the mean group redshift, and az is the velocity dispersion in km s~^, with errors from standard 
bootstrap analysis. Lx is the 0.1-2.4 keV luminosity in /ij"QQ erg s~^as calculated in Mahdavi et al. (2000). 

'^The data for these groups appeared originally in Mahdavi et al. (1999); however, our algorithm for determining group 
membership differs from the one used in that work, and hence the total number of members varies slightly. Also, A'obs 
in our table differs from iVrotai in Mahdavi et al. (1999), because we list the actual number of galaxies observed, rather 
than the number in the field brighter than a given magnitude. 

''Because of their large velocity dispersions, these systems were grouped with the rich clusters in Table 3 for our 
analysis. 
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Table 2 

New Galaxy Positions and Velocities " 



Name 




a2ooo 






^2000 




cz 




d 


Member 


SRGb062.001 


00 


11 


57.0 


+29 


29 


08 


28038 


35 


1.808 


N 


SRGb062.002 


00 


12 


11.8 


+29 


19 


09 


7728 


21 


1.838 


N 


SRGb062.003 


00 


12 


17.5 


+29 


52 


17 


6921 


20 


1.595 


N 


SRGb062.004 


00 


12 


28.4 


+29 


32 


38 


6862 


15 


1.656 


N 


SRGb062.005 


00 


12 


38.3 


+30 


06 


08 


6791 


23 


1.487 


M 


SRGb062.006 


00 


12 


45.0 


+29 


22 


15 


10419 


15 


1.683 


N 


SRGb062.007 


00 


13 


12.7 


+31 


08 


43 


14465 


28 


1.846 


N 


SRGb062.008 


00 


13 


45.1 


+30 


11 


40 


7109 


28 


1.209 


M 



Note. — cz is the galaxy redshift times the speed of hght in km s~^, Ccz is the 
measurement error, and d is the distance from the X-ray center in Mpc assuming Hq 
= 100 km s^^ Mpc^^. Members are indicated by an "M" in the Member column; 

nonmcmbcrs arc indicated by an "N." 

''This table is for illustrative purposes only. The complete data appear in the 
electronic version of the Journal. 



Table 3 



Cluster Sample 



Name 


o;20oo 


(52000 


memb 


cz 




log Lx 


References 




Abell 262 


01:52:50.4 


36:08:46 


103 


4920±55 


560±41 


43.14 


0.58 


G97,N01 


Abell 3158 


03:42:39.6 


-53:37:50 


118 


17740±91 


964±56 


44.12 


1.70 


K01,G96 


Abell 496 


04:33:37.1 


-13:14:46 


227 


9920±53 


729±37 


43.95 


1.17 


D00,M99 


Abell 1644 


12:57:14.8 


-17:21:13 


85 


14210±100 


947±73 


43.94 




G97 


Abell 1795 


13:49:00.5 


26:35:07 


84 


18880±95 


879±77 


44.44 


1.76 


G97 


Abell 1809 


13:53:18.9 


05:09:15 


61 


23750±100 


747±67 


43.60 




G97 


Abell 3571 


13:47:28.9 


-32:51:57 


90 


11700±101 


969±69 


44.26 


1.72 


G96 


Abell 2029 


15:10:58.7 


05:45:42 


87 


23140±140 


1235±78 


44.58 


2.20 


095,894 



Note. — A^mcmb IS thc number wo identify as cluster members; z is the mean cluster redshift, and Uz is the 
velocity dispersion of our derived membership in km s^^, with errors from standard bootstrap analysis. Lx 
is the total 0.1-2.4 kcV luminosity in h^^f^, erg s^^from Ebcling ot al. (1996), the typical error is 20%. Thc 
references are to X-ray and optical studies of the clusters that suggest they could be dynamically relaxed. C97: 
Cirimele et al. (1997); DOO: Durret et al. (2000); G96: Girardi et al. (1996); G97: Girardi et al. (1997); KOI: 
Kolokotronis et al. (2001); 095: Oegerle et al. (1995); NOl: Neill et al. (2001); S94: Slezak et al. (1994). 



Table 4 

Simulations of the Measurement Technique 



ID 




P 






n 








M200 




Pi 


Fit 
Quality 


in 


Out 




in 


Out 


in 


Out 


in 


Out 


in 


Out 


A 


0.0 


-0.3 - 


0.3 


1.0 


0.6-1.3 


1.0 


0.8-1.2 


0.3 


0.27-0.35 


0.05 


0.04-0.07 


0.96 


B 


0.5 


-0.2 - 


0.7 


2.0 


1.9-2.3 


1.5 


1.2-1.8 


3 


1.3-5.0 


0.1 


0.08-0.15 


0.75 


C 


-0.5 


-1.0 - ■ 


-0.3 


0.0 


0.0-1.3 


2.0 


1.8-2.2 


1 


0.6-2.0 


0.01 


0.0-0.03 


0.20 


D 


0.0 


-0.5 - 


0.2 


0.0 


0.0-0.7 


1.0 


0.7-1.0 


0.4 


0.3-0.4 


0.03 


0.01-0.04 


0.34 



Note. — Simulations of 893 galaxies carried out with 7 = 8. "In" refers to the input parameter set; "Out" shows the 
recovered ID 95% marginalized confidence interval for each parameter. 



Shape of the Matter Density 



Table 5 
Summary of Results 



Low-cT High-cr 



General Statistics 

No. of systems 33 15 

Completeness limit l-8r2oo 0.8r2oo 

No. of members within 0.8r2oo 521 945 

No. of members within 1.8t'2oo 893 

Range in (km/s) 110-470 470-1240 

Gaussian fit to jv = 22/27 x^/i^ = 32/28 

KS Test inR" d = 0.106, p = 10"^ 

KS Test inv^ d = 0.028, p = 0.87 

Likelihood of Constant Density Core (n = 0) 

Fit Quality, /3 < q = 0.003 q < 10"^ 

Fit Quality, /3 free q = 0.02 q = 0.003 

Likelihood of Free n 

Minimum 7 8 12 

Fit Quality q = 0.169 q = 0.124 

n 1.8-2.2 1.6-2.0 

P -0.37 - 0.42 0.02 - 0.67 

r5/2/'r2oo 3-12 5.8-26 

GM2oo/aVo 2.2-2.7 2.2-3.0 

Pi 0.01-0.05 0.01-0.02 



^Conducted only for galaxies with R < 0.8r2oo. 
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APPENDIX 

THE DISTRIBUTION FUNCTION INTEGRAL 



In the statistical interpretation of the distribution function (DF), the probability of observing a galaxy with a line-of- 
sight peculiar velocity at a projected distance R from the cluster center in a survey of Nq galaxies within a limiting 
radius i?iim is (Merritt & Saha 1993; van der Marel et al. 2000) 

p'{R,v,) = ^J J J fi£,L')dv^dvHdz, (Al) 

where (;>'n,V0,Vz) are the components of the galaxy's peculiar motion in cylindrical coordinates, £ = ^'(r) — L is the 
angular momentum, and / is the distribution function (DF). The probability distribution is normalized such that 

/ / p'{R,v,\a)dv,dR= —i:{R)dR=l, (A2) 

where T,{R) is the surface number density of galaxies. We use a DF of the form 

f{£,L^) = fo£''-'/^L-^<' (A3) 

where /o is fixed by the normalization requirement in equation (A2). The limits of the integration are determined as 
follows: (1) a galaxy bound to the cluster must always have vjj^ + v'^ + v^ < 2^", and (2) a galaxy with peculiar velocity Vz 

has an associated maximum radius rmax such that = 2^'(rmax), beyond which it is unbound. With these conditions, 
and switching to the coordinate system vr = wsint], = wcost], we have 

p'{R,Vz) = —^ drjx (A4) 



No Jr Vr2 - i?2 



= r^vP' cos^ 7] + r'^ {w sin rj cos 9 — Vz sin 0)^ , (A5) 
R 

sin 6*=—. (A6) 
r 

For /3 > 0, the above expression has an integrable singularity at L = 0, or [rj = Tr/2,w = ti^tan^). Changing variables 
from (?7, w) to {Q, Li^) is tempting, but the integration boundaries in those coordinates are not necessarily analytic. It is 
best to work in the (r/, w) plane and take advantage of the fact that, for Gaussian quadrature, the innermost integrand 
will never be evaluated at the singular point 77 = 7r/2, and so will always be finite at w; = Uztan^. We find that making 
the change of variables appropriate for an inverse square-root singularity inw(ijj = tip') (Press et al. 1992) yields accurate 
results for the innermost integral. As for the middle, 77 integral, it may be evaluated using Gaussian quadrature (Press 
et al. 1992) with a constant weighting function for /? < 1/2. For /? > 1/2, Gaussian quadrature, with (7r/2 — 77)^"^^ as 
the weighting function, is best. 

The galaxy number density u{r) corresponding to the chosen DF is given by an integral over all velocities; this integral 
is discussed in detail by Cuddeford (1991) and Merritt (1985). 

v{r)=[ f{£,L^)d\ (A7) 

r Jo Jo ^2r^-9-£)~L^ 

= 23/^-V/Vo ^^"r|a-Tr2y^^ ""'"^^^^""'^'- ^^^^ 



